Jell: Parser Generator Using other features
This section describes features jell provides to deal with both these problems.
One type of ambiguity is when there are two choices to continue parsing a rule, and the first token that can appear in each choice is the same. For instance, if we want a grammar to parse both declarations and assignment statements, and we write this set of rules for it.
stmt ::= assignment | typeDecl ; typeDecl ::= ID varList SEMI ; varList ::= ID (COMMA ID) * ; assignment ::= ID ASSIGN NUMBER;Jell will give this warning for the grammar
Warning: or/or ambiguous for stmt on seeing { ID }
Both typeDecl and assignment
start with an ID, so with one token of lookahead jell is unable
to decide which rule to use. Jell will generate a parser anyway, and
it will arbitrarily pick the first rule (in this case assignment)
if it encounters an ID in its input.Jell provides a way to embed directives to control the parsing for such cases through the %if{ directive, and also provides a peek() method that lets you peek into tokens further down the stream.
stmt ::=
(%if{ peek(2).type() == token.ASSIGN %} assignment) |
typeDecl ;
The code within the %if{ .. %} is taken as a boolean
expression and added as an extra conditional in the generated
parser. When an ID is seen on the input, the parser also
applies the conditional to decide which of the two rules to pick.Another source of ambiguity is of the dangling else type.
stmt ::= IF expr THEN stmt (ELSE stmt)?
|
OTHER ;
On processing this grammar, jell generates the warning
Warning: ``?'' choices for stmt ambiguous on { ELSE }
The parser can either choose to reduce the else part to empty or it
can try to expand the ELSE part of the rule. Jell will
generate a parser that will always enter the ELSE portion,
which corresponds to matching up else's with the closest if's, and is
similar to how LR parsers deal with such ambiguities.Similar ambiguities can arise for * or + choices, and in all cases, jell will generate a parser that will try to match as much of the input as possible before returning from the choices.
1 - (-2 3 + (-4 - - 5 )It gives the following error messages
Line 1: 3: Syntax error, expected { PLUS MINUS MULT DIV RPAREN EOF }
Line 1: PLUS: restarting
Line 2: EOF: Syntax error, expected { RPAREN }
Line 2: EOF: restarting
Inserting RPAREN
On detecting an error, jell first reports the location of the error
and the local set of tokens that would have made sense for the parser.
Next, it starts skipping tokens (maybe none) until it find one it
can use to restart the parse. This new location is reported. Finally,
if any tokens need to be inserted in order to "resynchronize" the parse,
these are added and also reported.All errors ultimately percolate down to the method int error(String s) which can be overridden if you wish to keep track of any statistics and so on. For finer grained control of error tracking, there are three methods that are used to handle each type of message generated by jell.
Syntax errors are first handed to the method void syntaxError(token cur_token, int expectedTokens[]). The integer array contains the set of token id's that were expected to be present.
Restart messages are handed to the method void restartMessage(token restartToken) which is given the token where the parse will resume.
Any tokens that are inserted are passed to the method void insertMessage(int insertedTokenId) which is given the id (the constant in the token interface) of the inserted token.
The strategy employed by jell to recover from errors isn't particularly smart, but does reasonably well on grammars for expressions and common language constructs like it-then-else or while statements.
Actions are not executed while skipping tokens, but a value returned from a rule may potentially be uninitialized because the parse failed for that portion of the rule. Therefore, remember to always initialize any values that can be returned from a rule (the java compiler also will warn about any uninitialized variables it detects) and always remember to check if a valid value has been returned from a rule.
KB Sriram
Comments, bug reports: kbs@sbktech.org
Revised: Sat May 25 10:45:44 1996
URL: http://www.sbktech.org/jell-ex2.html